Abstract

We model discrete observations made at two time and space scales by using a
single continuous diffusion process with a time-dependent coefficient. We
define new parameters for the large scale model as functions of the cumulants
of the small scale distribution. We use the non-uniform distribution of the
observation time intervals to obtain consistent and unbiased estimators
for these parameters. Closed form expressions for migration proportions
between spatial domains are derived as functions of these parameters. The
models are applied to estimate migration patterns from satellite tag data.
1 Introduction



Statistical analysis of complex ecological systems [1], [2], financial data [3] or
genetics [4] increasingly relies on stochastic models for the processes underlying
the data. In addition, most cases require integration of several types of
deterministic and stochastic models [5]. The presence of errors with an a priori
unknown distribution makes estimation even more difficult.

In this paper, we propose a method of statistical inference at long time
and large space scales when the available data consist of discrete observations,
measured (with generally distributed errors) at non-identical time intervals and
at a much smaller scale. For this purpose, we define meaningful process parameters
and find corresponding unbiased and consistent estimators which can be used
for inference.

We are motivated by a specific type of data, consisting of multiple time
series of spatial locations, with finite lengths, measured at unequal, finite time
intervals. This is a typical structure for observations of complex ecological
systems with migration processes, such as observations from automatic positioning
instruments recording location using GPS signals in data storage tags (DSTs).
Aggregated counts on spatially extended domains, at given time intervals, may
also be available and need to be used simultaneously. In either case, one is
interested in predictions of migration proportions between spatial domains, over
large time intervals, for ensembles of possibly non-identical individuals of the
given system.

Models of population dynamics quickly become analytically infeasible, which
is why numerical approaches abound, some even with little theoretical
justification. Detailed multispecies models of population dynamics commonly need
to include spatial structure to describe temporally variable species overlap [5]
and these can quickly become computationally infeasible. For example, models
with unknown temporally varying migration rates between several areas give
obvious estimation problems, particularly in a multispecies context. It is
therefore important to formulate migration in such a manner as to reduce the number
of parameters, yet allow flexibility and permit the incorporation of the
migration process into typical box models. In particular, in a complex framework
such as a multispecies, multiarea Gadget model [H], 0, it is not feasible to 
incorporate a computational layer which requires numerical solutions to partial 
differential equations to describe migration. Rather, solutions in closed form are 
required to describe the migration processes. 

We assume that the process underlying the observation scale is a fairly general
diffusion [10], [H]. The continuous model which is discretely observed may
be regarded as the limit of a biased random walk (unobserved, at a much smaller
scale) with identical or non-identical steps, i.e. with constant or time-dependent
drift and diffusion coefficients, respectively. If several spatial paths are observed,
we assume the same number of independent diffusions as underlying processes.

Diffusion processes may be described in several ways. One can use the
stochastic equation representation of the type $dr_t = \beta\,dt + D\,dB_t$, for general
drift ($\beta$) and diffusion ($D$) which may depend on time and on the process $r_t$, and
with $B_t$ a Brownian motion. We will use the complementary representation, the
partial differential equation (Kolmogorov forward or Fokker-Planck equation)
which describes the evolution of the probability density function $P(r, t)$ in time
and space:

$$\frac{\partial P(r,t)}{\partial t} = -\nabla\left(\beta P(r,t)\right) + \nabla\left(D\,\nabla^{T}\right)P(r,t) \tag{1}$$

Here, $D$ is the diffusion matrix and $\beta$ the drift vector, for a general
$d$-dimensional case ($r \in \mathbb{R}^d$). Note that higher order derivative terms could be included
in equation (1) when considering more general models.

The inference scale, much larger, will be characterized by the same underlying
processes but different initial and boundary conditions for equation (1),
imposed in our case by the ecological constraints. We give (section 2) closed
form solutions for the migration proportions, which depend on newly defined
large scale drift and diffusion parameters.

Diffusion models have been frequently employed in modeling migration ([1],
[2]). Most of them rely on numerical solutions, which would at least considerably
slow down any complex system analysis involving several time scales and
several deterministic and stochastic processes. Severe limitations related to such
solutions can be avoided by using analytical approximations as provided in [12]
for the case of one-dimensional diffusion processes and Gaussian noise.

By contrast, we use the non-uniform time-interval distribution as an
advantage in calculating the cumulants of the long time, large scale distribution
of observations as a function of the smaller observation scale. This allows us
to introduce what we call effective and collective models, parameters and their
estimators (section 3). We illustrate our method with a real data example of
migration, in section 4.

2 Process modeling and main assumptions 

In this section we briefly review two typical solutions of the Fokker-Planck
equation (1), which will play key roles in the construction of statistical models
in section 3, since they provide the distributions of the true (under the model)
values of positions for a given time interval distribution.

Although we allow for time dependence of the diffusion coefficient, we still 
make a series of simplifying assumptions: 

(a) the lengths of time intervals between observations, the measurement errors
and the true positions are independently distributed;

(b) errors are independently and identically distributed (according to some
general, non-Gaussian law). Errors and process are also independent.

(c) the space domain $\Omega$ is 2-dimensional ($d = 2$). We will actually work only
on rectangular domains, in order to give analytical solutions as far as possible.
This can be extended to more general geometries, but keeping closed forms
for the results would require other assumptions.

(d) the diffusion is homogeneous in space: the matrix $D$ is diagonal with identical
elements $D(t)$ which may depend on time.

(e) the link between the observed ($r_i^{obs}$) and true ($r_i$) values is given by a simple
additive statistical model: $r_i^{obs} = r_i + e_i$, where $e_i$ are measurement errors.

(f) we define the distribution of errors $e$ in terms of cumulants with $k_1^e = 0$,
a diagonal $k_2^e$ matrix, and possibly higher order cumulants $k_a^e(e_{i_1}, \dots, e_{i_a})$. The
variance-covariance matrix for the error distribution is assumed to be diagonal
and we choose for simplicity $(k_2^e)_{jj} = k_2^e$.

2.1 Discrete observations scale 

We are motivated by position and time data recorded by satellite tags for
migration studies. They provide a large number of observations $r_0^{obs}, \dots, r_n^{obs}$, at
finite time intervals $t_0, \dots, t_n$, for many finite paths $\gamma \in \Gamma$, where the set $\Gamma$ is
included in the spatial domain $\Omega$.
The boundary conditions for equation (1) will be:

$$P(r \to \pm\infty, t) \to 0 \tag{2}$$

since we assume that the boundary of the spatial domain $\Omega$ is very "far" from
any observed path.

If $D(t)$ is constant in time, the solution of the Fokker-Planck equation becomes:
$P(r, t) = \int G(\delta r \mid \delta t)\,P_0(r_0, t_0)\,dr_0$. Here $P_0(r_0, t_0)$ are initial conditions and the
transition density is a Gaussian in the true (under the model) position differences
$\delta r = r - r_0$ corresponding to any given time interval $\delta t = t - t_0$:

$$G(\delta r \mid \delta t) = \frac{1}{(4\pi D\,\delta t)^{d/2}} \exp\left(-\frac{\|\delta r - \beta\,\delta t\|^{2}}{4D\,\delta t}\right) \tag{3}$$
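As a quick numerical sanity check of the transition density (3), the sketch below evaluates the one-dimensional case and verifies by simple quadrature that it integrates to one, with mean $\beta\,\delta t$ and variance $2D\,\delta t$. The function names and the parameter values are illustrative assumptions, not part of the original analysis.

```python
import math

def transition_density(dr, dt, beta, D):
    """One-dimensional Gaussian transition density G(dr | dt) of equation (3)."""
    var4 = 4.0 * D * dt  # 4*D*dt, i.e. twice the variance 2*D*dt
    return math.exp(-(dr - beta * dt) ** 2 / var4) / math.sqrt(math.pi * var4)

def moments_by_quadrature(dt, beta, D, half_width=12.0, steps=20000):
    """Midpoint-rule estimates of the mass, mean and variance of G(. | dt)."""
    sd = math.sqrt(2.0 * D * dt)
    lo = beta * dt - half_width * sd
    h = 2.0 * half_width * sd / steps
    mass = mean = second = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        g = transition_density(x, dt, beta, D) * h
        mass += g
        mean += x * g
        second += x * x * g
    return mass, mean, second - mean ** 2

# Illustrative (assumed) values: dt = 0.5, beta = 0.3, D = 1.2.
mass, mean, var = moments_by_quadrature(dt=0.5, beta=0.3, D=1.2)
print(mass, mean, var)  # close to 1, beta*dt and 2*D*dt respectively
```

The first two cumulants recovered here are the quantities $k_1 = \beta\,\delta t$ and $k_2 = 2D\,\delta t$ used in section 3.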



Remark 2.1

The Green function solution of the Fokker-Planck equation with time-dependent
diffusion coefficient $D(t)$ is still a Gaussian:

$$G(\delta r \mid \delta t) = \frac{1}{\left(4\pi \int_{t_0}^{t} D(s)\,ds\right)^{d/2}} \exp\left(-\frac{\|\delta r - \beta\,\delta t\|^{2}}{4\int_{t_0}^{t} D(s)\,ds}\right) \tag{4}$$

This will allow us to solve both statistical inference problems (for constant and
time-dependent diffusion) in a very similar manner.

2.2 Large scale counts and proportions

The second type of observations we need to model are the counts on extended
spatial domains (we will denote coordinates $R \in \Omega$ to distinguish them from the finer
spatial scale), at given (long) time intervals ($\Delta T$). An example is provided again
by migration studies (mark-recapture data), where classical tags are used and
only aggregated counts can be recorded, at longer time intervals.

Our main goal is to estimate migration proportions, i.e. the fraction of paths
which start in a given spatial domain and end in another domain, after a given
time $\Delta T$. We derive here the theoretical expressions of these proportions as
functions of process parameters, and we will show in the next section how these
parameters can be estimated.

The same stochastic process is assumed to generate the true values. The
same differential equation for the probability distribution function has to be
solved, but for different boundary conditions,
and with $P(R, T_0)$ = constant on a given $\Omega_0$ as initial conditions. Here, the
distances $R = \|R\|$ are much larger than the typical distances $r = \|r\|$ in the
previous section. These are particular conditions we chose in order to model
the facts that the migrating individuals do not leave a given habitat ($\Omega$), that the
distances between observations are comparable with the characteristic lengths
(denoted $L_x, L_y$) of the domain $\Omega$, and that the time intervals between two
counting experiments are large enough for their distribution to become uniform
on a given area. Note that these initial conditions are also the long time
($\Delta T \gg \delta t$) limit of the solutions of the previous problem (subsection 2.1).

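The Green function solution can be explicitly calculated, for arbitrary
$\Delta R = R_f - R_i$ (with coordinates $\Delta X$, $\Delta Y$), under assumption (d):

$$G(\Delta R \mid \Delta T) = G(\Delta X \mid \Delta T)\,G(\Delta Y \mid \Delta T) \tag{5}$$

Here:

$$G(\Delta X \mid \Delta T) = \sum_n \left(I^{x}_{1n} + I^{x}_{2n}\right), \qquad G(\Delta Y \mid \Delta T) = \sum_n \left(I^{y}_{1n} + I^{y}_{2n}\right) \tag{6}$$

and:

$$I^{x}_{1n} = \frac{1}{\sqrt{4\pi D\,\Delta T}} \exp\left(-\frac{(X_f - X_i - \beta_x \Delta T + 2nL_x)^2}{4D\,\Delta T}\right) \tag{7}$$

$$I^{x}_{2n} = \frac{1}{\sqrt{4\pi D\,\Delta T}} \exp\left(-\frac{(X_f + X_i - \beta_x \Delta T + 2nL_x)^2}{4D\,\Delta T}\right) \tag{8}$$

Analogous expressions, with the $Y$ coordinates, the dimension $L_y$ and the drift $\beta_y$, hold
for the $y$ terms.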
The model now provides the migration proportions $w_{if}$, defined as the
fraction of paths which start in a given area ($A_i$) at time $T_0$ and are found in an
area ($A_f$) after a time $T$. Let us denote by $X_i^U, Y_i^U, X_i^L, Y_i^L$ and $X_f^U, Y_f^U, X_f^L,
Y_f^L$ the coordinates of the upper-right and lower-left corners of two rectangular
areas $A_i$ and $A_f$ respectively.

The initial conditions on the initial large area are given by a uniform dis- 
tribution at time Tq. This is consistent with the long time limit of small scale 
solutions. 

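Proposition 2.1

The proportions $w_{if}$ are given by:

$$w_{if} = \frac{\int_{A_f} da_f \int_{A_i} da_i\, G(R_f - R_i \mid \Delta T)}{\int_{\Omega} da_f \int_{A_i} da_i\, G(R_f - R_i \mid \Delta T)} \tag{9}$$

which, due to (5), becomes $w_{if} = w^x_{if}\,w^y_{if}$, with $w^x_{if} = n^x_{if}/n^x_{i}$ and:

$$n^x_{if} = \int_{X_f^L}^{X_f^U} dX_f \int_{X_i^L}^{X_i^U} dX_i \sum_n \left(I^{x}_{1n} + I^{x}_{2n}\right) \tag{10}$$

Similar expressions can be written for $w^y_{if}$.

Each term in the sum (10) has a tractable form. We give as an example
$n^x_{if} = \sum_n \left(I_{11} - I_{12} + I_{21} - I_{22}\right)$, where:

$$I_{11} = F(X_f^U - X_i^L - \beta_x\Delta T + 2nL_x) - F(X_f^L - X_i^L - \beta_x\Delta T + 2nL_x)$$
$$I_{12} = F(X_f^U - X_i^U - \beta_x\Delta T + 2nL_x) - F(X_f^L - X_i^U - \beta_x\Delta T + 2nL_x)$$
$$I_{21} = F(X_f^U + X_i^U - \beta_x\Delta T + 2nL_x) - F(X_f^L + X_i^U - \beta_x\Delta T + 2nL_x)$$
$$I_{22} = F(X_f^U + X_i^L - \beta_x\Delta T + 2nL_x) - F(X_f^L + X_i^L - \beta_x\Delta T + 2nL_x)$$

and: $F(z) = F(-z) = \frac{1}{2}\left(z\,\mathrm{erf}(S) + \sqrt{4D\,\Delta T/\pi}\,\exp(-S^2)\right)$ for $S = z/\sqrt{4D\,\Delta T}$.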
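The closed form of Proposition 2.1 can be checked numerically. The sketch below is a minimal illustration under stated assumptions: a one-dimensional domain $[0, L_x]$ with image terms shifted by $2nL_x$ as in the text, $F$ taken to be the second antiderivative of the Gaussian kernel with variance $2D\Delta T$, and a truncated image sum. All function names and parameter values are hypothetical.

```python
import math

def F(z, var4):
    """Second antiderivative of exp(-z^2/var4)/sqrt(pi*var4), with var4 = 4*D*dT."""
    s = z / math.sqrt(var4)
    return 0.5 * (z * math.erf(s) + math.sqrt(var4 / math.pi) * math.exp(-s * s))

def n_closed(xf_lo, xf_hi, xi_lo, xi_hi, beta, D, dT, L, n_images=3):
    """Closed-form double integral of sum_n (I_1n + I_2n) over [xf_lo, xf_hi] x [xi_lo, xi_hi]."""
    var4 = 4.0 * D * dT
    total = 0.0
    for n in range(-n_images, n_images + 1):
        c = beta * dT - 2.0 * n * L  # kernel argument is X_f -/+ X_i - c
        i11 = F(xf_hi - xi_lo - c, var4) - F(xf_lo - xi_lo - c, var4)
        i12 = F(xf_hi - xi_hi - c, var4) - F(xf_lo - xi_hi - c, var4)
        i21 = F(xf_hi + xi_hi - c, var4) - F(xf_lo + xi_hi - c, var4)
        i22 = F(xf_hi + xi_lo - c, var4) - F(xf_lo + xi_lo - c, var4)
        total += i11 - i12 + i21 - i22
    return total

def n_numeric(xf_lo, xf_hi, xi_lo, xi_hi, beta, D, dT, L, n_images=3, m=100):
    """Same double integral by a midpoint rule on an m x m grid (for comparison only)."""
    var4 = 4.0 * D * dT
    hf, hi = (xf_hi - xf_lo) / m, (xi_hi - xi_lo) / m
    total = 0.0
    for a in range(m):
        xf = xf_lo + (a + 0.5) * hf
        for b in range(m):
            xi = xi_lo + (b + 0.5) * hi
            for n in range(-n_images, n_images + 1):
                c = beta * dT - 2.0 * n * L
                total += math.exp(-(xf - xi - c) ** 2 / var4)
                total += math.exp(-(xf + xi - c) ** 2 / var4)
    return total * hf * hi / math.sqrt(math.pi * var4)

def proportion_x(xf_lo, xf_hi, xi_lo, xi_hi, beta, D, dT, L):
    """w^x_if = n^x_if / n^x_i, where the denominator integrates X_f over [0, L]."""
    num = n_closed(xf_lo, xf_hi, xi_lo, xi_hi, beta, D, dT, L)
    return num / n_closed(0.0, L, xi_lo, xi_hi, beta, D, dT, L)

# Illustrative (assumed) values: A_i = [1, 4], A_f = [5, 9], L_x = 10.
closed = n_closed(5.0, 9.0, 1.0, 4.0, 0.2, 1.0, 2.0, 10.0)
numeric = n_numeric(5.0, 9.0, 1.0, 4.0, 0.2, 1.0, 2.0, 10.0)
print(closed, numeric, proportion_x(5.0, 9.0, 1.0, 4.0, 0.2, 1.0, 2.0, 10.0))
```

Because the closed form only evaluates $F$ at the corners of the two intervals, it avoids any numerical integration, which is the property that makes it usable inside larger box models.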

Similar formulae can be written for the $Oy$ terms, thus the proportions $w_{if}$ have
closed expressions. Equation (9) is just the definition of the probability of finding
the final position in $A_f$, given that the initial position lies in $A_i$, normalized by
$\int_{\Omega} da_f \int_{A_i} da_i\, G(R_f - R_i \mid \Delta T)$, since the solution of the Fokker-Planck equation
does not necessarily integrate to 1 for arbitrary boundary conditions ([3]). The
next steps above are just elementary calculations.

We will show in the next section that there exist meaningful parameters for the
drift and diffusion coefficient used in calculating the proportions $w_{if}$, and that
they can be estimated consistently and without bias.

3 Statistical model

In this section, we will define the joint distributions of the observed positions.
We then identify meaningful parameters of the resulting statistical model and
find estimators which can be used for long time and large scale inference.

Due to the separability of the solutions in section 2.1, we can restrict the
present calculations to the one-dimensional case. One can easily check the validity
of the two-dimensional generalization of these results.

3.1 The model parameters 

Under the stochastic model previously described (section 2.1), for any given $\delta t_i$,
each $\delta x_i$ is normally distributed, with cumulants: $k_{1(i)} = \beta\,\delta t_i$, $k_{2(i)} = 2D\,\delta t_i$ if
$D$ is constant, or $k_{2(i)} = 2\int_{t_{i-1}}^{t_i} D(t)\,dt$ when $D$ depends on time.

The joint distribution of errors $(e_1, \dots, e_n)$ has a null vector as mean and
diagonal higher order (tensor) cumulants, since the errors are i.i.d.

Each difference $\delta e_i$ has a distribution defined by the following cumulants:

$$k^{\delta e}_{1(i)} = 0, \quad k^{\delta e}_{2(i)} = 2k_2^e, \quad k^{\delta e}_{2p+1(i)} = 0, \quad k^{\delta e}_{2p(i)} = 2k^e_{2p}, \quad \text{for } p = 1, 2, \dots$$

However, their joint distribution will have non-diagonal higher cumulants.
For example, $k_2(\delta e_i, \delta e_{i+1}) = -k_2^e$.
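These properties of differences of i.i.d. errors are easy to check by simulation. The sketch below assumes, purely for illustration, centred exponential errors (a skewed, non-Gaussian law with $k_2^e = 1$ and $k_3^e = 2$) and estimates the variance, the third central moment and the lag-one covariance of the differences; the function name, sample size and seed are arbitrary choices.

```python
import random

def error_difference_moments(n=200_000, seed=1):
    """Sample variance, third central moment and lag-one covariance of de_i = e_i - e_(i-1)."""
    rng = random.Random(seed)
    # Centred exponential errors: k1 = 0, k2 = 1, k3 = 2 (assumed law, for illustration).
    e = [rng.expovariate(1.0) - 1.0 for _ in range(n + 1)]
    de = [e[i + 1] - e[i] for i in range(n)]
    m = sum(de) / n
    var = sum((d - m) ** 2 for d in de) / n
    third = sum((d - m) ** 3 for d in de) / n
    cov = sum((de[i] - m) * (de[i + 1] - m) for i in range(n - 1)) / (n - 1)
    return var, third, cov

var, third, cov = error_difference_moments()
print(var, third, cov)  # roughly 2*k2, 0 and -k2 for this error law
```

Although each error is skewed, the odd cumulants of the differences vanish, while successive differences share one error term and are therefore negatively correlated.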
Although the joint multivariate distribution of $(\delta x_1, \dots, \delta x_n)$ has a diagonal
second order cumulant (the variance-covariance matrix) and all higher order
(tensor) cumulants are zero, this is not the case for the joint distribution of
the observed values $(\delta x_1^{obs}, \dots, \delta x_n^{obs})$. The first cumulant of this distribution is
the same vector which is the cumulant of the true (under the model) values,
i.e. $k_1 = (\beta\,\delta t_1, \dots, \beta\,\delta t_n)$. The non-null elements of the variance-covariance
matrix are given by $k^{obs}_{2(i,i)} = k_{2(i)} + 2k_2^e$ and $k^{obs}_{2(i,i\pm 1)} = -k_2^e$. The
elements of higher order cumulants depend only on the error distribution and
can be straightforwardly calculated. A useful example is given in the following
property.

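Proposition 3.1

The joint cumulants of the type $k_a(\delta x_i^{obs}, \dots, \delta x_j^{obs}, \dots)$, where $\delta x_i^{obs}$ appears $a_i$
times and $\delta x_j^{obs}$ appears $(a - a_i)$ times, satisfy the following
relations when $a > 2$:

$$k_a(\delta x_i^{obs}, \delta x_j^{obs}, \dots) = \left(1 + (-1)^a\right) k_a^e \tag{11}$$

if $i = j$, $k_a(\delta x_i^{obs}, \delta x_j^{obs}, \dots) = (-1)^{a_i} k_a^e$ for $j = i - 1$, $k_a(\delta x_i^{obs}, \delta x_j^{obs}, \dots) =
(-1)^{a_i+1} k_a^e$ for $j = i + 1$, and zero otherwise. The proof is given in the Appendix.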

Remark 3.1

The available data consist of observations $x_0^{obs}, \dots, x_n^{obs}$, at $t_0, \dots, t_n$, i.e.
$\delta x_1^{obs}, \dots, \delta x_n^{obs}$ corresponding to $\delta t_1, \dots, \delta t_n$. The time intervals are thus
distributed according to some (discrete) probability density (with weights $p(\delta t_i)$).
This is however an advantage for our purposes, since it implies that, in a long
time $\Delta T \to \infty$, any given $\delta t_i$ is sampled $n_i$ times, with $n_i = n\,p(\delta t_i)$, where $n$
is the total number of intervals ($\sum_i \delta t_i = \Delta T$). Therefore, for each distinct $\delta t_i$,
a number of $n_i$ values of $\delta x$ are sampled from a common distribution with first
and second cumulant depending on $\delta t_i$.

As a consequence, for large enough values of $n_i$, one could estimate the
individual mean and variance of each $\delta x_i^{obs}$ distribution. However, our goal being
statistical inference over long time $\Delta T$, we need to estimate the parameters of
the $\Delta X = \sum_i \delta x_i^{obs}$ distribution. We give the theoretical expressions of these
meaningful parameters in what follows.
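The grouping described in Remark 3.1 can be sketched as follows. This is a minimal illustration that assumes the interval lengths take a discrete set of exactly repeated values, so that they can be used as dictionary keys; the function name and the toy data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, variance

def group_by_interval(dts, dxs):
    """Group observed increments by distinct interval length and summarise each group."""
    groups = defaultdict(list)
    for dt, dx in zip(dts, dxs):
        groups[dt].append(dx)
    n = len(dts)
    summary = {}
    for dt, values in groups.items():
        summary[dt] = {
            "n_i": len(values),            # number of increments with this interval
            "p": len(values) / n,          # empirical weight p(dt_i) = n_i / n
            "mean": mean(values),          # estimate of the first cumulant
            # Sample variance needs at least two values in the group.
            "var": variance(values) if len(values) > 1 else float("nan"),
        }
    return summary

# Toy data (assumed): two distinct interval lengths.
summary = group_by_interval([1, 2, 1, 2, 1], [0.5, 1.0, 0.7, 1.4, 0.3])
print(summary)
```

With real tag data the interval lengths would typically need rounding to a common resolution before grouping, which is a modelling choice not addressed in this sketch.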

Definition 3.1: For a given discretely observed (with noise) diffusion process
over $\sum_i \delta t_i = \Delta T$, the effective drift parameter $\beta_{eff}$ is defined by:

$$\beta_{eff} \sum_i \delta t_i = \sum_i k_{1(i)} \tag{12}$$

This implies:

$$\beta_{eff} \sum_i^{*} p(\delta t_i)\,\delta t_i = \sum_i^{*} p(\delta t_i)\,k_{1(i)} \tag{13}$$

where the sums $\sum^{*}$ are over distinct values of $\delta t_i$. In the continuum limit of
the time interval distribution, equation (13) becomes:

$$\beta_{eff} = \frac{\int k_1[\delta t]\,p(\delta t)\,d(\delta t)}{\int (\delta t)\,p(\delta t)\,d(\delta t)} \tag{14}$$

We indicate the dependence on the time interval of any cumulant by using the
notation $k_a[\delta t]$.

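Definition 3.2: For a given discretely observed (with noise) diffusion process
over $\sum_i \delta t_i = \Delta T$, the effective diffusion parameter $D_{eff}$ is defined by:

$$2D_{eff} \sum_i \delta t_i = k_2(\Delta X, \Delta X) \tag{15}$$

where $\Delta X = \sum_i \delta x_i^{obs}$. We can make explicit the second cumulant of the $\Delta X$
distribution in terms of the second cumulants of $\delta x_i^{obs}$:

$$k_2(\Delta X, \Delta X) = \sum_i \sum_j k_2(\delta x_i^{obs}, \delta x_j^{obs}) \tag{16}$$

which after simple processing gives:

$$k_2(\Delta X, \Delta X) = \sum_i k_{2(i)} + 2k_2^e = \sum_i^{*} n_i\,k_{2(i)} + 2k_2^e \tag{17}$$

where again, the sums $\sum^{*}$ are over distinct values of $\delta t_i$. The first equality was
obtained by using the properties of the joint distribution of $(\delta x_1^{obs}, \dots, \delta x_n^{obs})$
mentioned at the beginning of this section:

$$\sum_i \sum_j k_2(\delta x_i^{obs}, \delta x_j^{obs}) = \sum_{i=2}^{n-1} \left(k_2(\delta x_i^{obs}, \delta x_{i-1}^{obs}) + k_2(\delta x_i^{obs}, \delta x_i^{obs}) + k_2(\delta x_i^{obs}, \delta x_{i+1}^{obs})\right) + k_2(\delta x_1^{obs}, \delta x_1^{obs}) + k_2(\delta x_1^{obs}, \delta x_2^{obs}) + k_2(\delta x_n^{obs}, \delta x_n^{obs}) + k_2(\delta x_n^{obs}, \delta x_{n-1}^{obs}) = \sum_{i=1}^{n} k_2(\delta x_i, \delta x_i) + 2k_2^e$$

The resulting equation for $D_{eff}$ is then:

$$2D_{eff} \sum_i^{*} p(\delta t_i)\,\delta t_i = \sum_i^{*} p(\delta t_i)\,k_{2(i)} + 2n^{-1} k_2^e \tag{18}$$

In the continuum limit of the time interval distribution (large $n_i$ and large $n$,
small $\delta t$), equation (18) becomes:

$$D_{eff} = \frac{\int k_2[\delta t]\,p(\delta t)\,d(\delta t)}{2\int (\delta t)\,p(\delta t)\,d(\delta t)} \tag{19}$$

The error term in (18) is $O(n^{-1})$ and, provided $k_2^e$ is finite, does not contribute
to the continuous limit above.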
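Equations (13) and (18) translate directly into a small computation once the interval weights and the cumulants $k_1[\delta t]$, $k_2[\delta t]$ are given. The sketch below is an illustration under these assumptions; the function name, its arguments and the numerical values are hypothetical and not taken from the original analysis.

```python
def effective_parameters(dts, probs, k1, k2, k2_err=0.0, n=None):
    """Effective drift and diffusion from a discrete interval distribution.

    dts, probs : distinct interval lengths and their weights p(dt_i)
    k1, k2     : functions returning the first and second cumulant for a given dt
    k2_err     : error variance k2^e; its O(1/n) contribution is kept only if n is given
    """
    mean_dt = sum(p * dt for dt, p in zip(dts, probs))
    beta_eff = sum(p * k1(dt) for dt, p in zip(dts, probs)) / mean_dt
    err = 2.0 * k2_err / n if n else 0.0
    d_eff = (sum(p * k2(dt) for dt, p in zip(dts, probs)) + err) / (2.0 * mean_dt)
    return beta_eff, d_eff

# Constant drift 0.3 and diffusion 1.5 (assumed values): the effective
# parameters should coincide with the constants when the error term is dropped.
dts, probs = [0.5, 1.0, 2.0], [0.5, 0.3, 0.2]
beta_eff, d_eff = effective_parameters(
    dts, probs, k1=lambda dt: 0.3 * dt, k2=lambda dt: 2.0 * 1.5 * dt
)
print(beta_eff, d_eff)
```

Passing a finite `n` together with `k2_err` shows how quickly the error contribution in (18) becomes negligible as the number of intervals grows.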

Remark 3.2 

It is easy to check that if $D$ is constant or the distribution of the time intervals
is uniform, the effective parameter coincides with the constant value or the
integrated $\int D(t)\,dt$, respectively.
Remark 3.3

The distribution of large scale values $\Delta X$ is not completely specified by the first
two cumulants, and the corresponding Fokker-Planck equation should contain
higher order spatial derivatives. However, keeping only the first two terms is a
reasonable approximation, since we can easily check that $k_3(\Delta X, \Delta X, \Delta X) = 0$,
and only at the fourth order we obtain $k_4(\Delta X, \Delta X, \Delta X, \Delta X) = 8k_4^e(2n - 1)$, thus
a fourth order derivative with an $8k_4^e$ coefficient in the generalized Fokker-Planck
equation.

In applications we can encounter the problem of observing many diffusion 
paths which are not necessarily generated by the same stochastic process, i.e. 
with the same drift and diffusion coefficients. For example, in ecological systems 
models, the paths correspond to different individuals which can have different 
behaviour. However, if the goal of statistical inference is long term and large 
space scale predictions for the ensemble of diffusion processes (the group of 
individuals), we can define new parameters which will describe this ensemble. 
We will call them collective drift and diffusion coefficient. 

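Let $\beta_{eff}^{\gamma}$, $D_{eff}^{\gamma}$, $\Delta T^{\gamma}$, $k_1(\Delta X)^{\gamma}$, $k_2(\Delta X, \Delta X)^{\gamma}$ be the parameters, the
characteristic long time and the large distance cumulants of any path $\gamma$ from an
arbitrary set $\Gamma$.

Definition 3.3

The collective drift and diffusion coefficient for the ensemble $\Gamma$ are defined by
the equations:

$$\beta_{collect}\,E_{\Gamma}(\Delta T^{\gamma}) = E_{\Gamma}\left(k_1(\Delta X)^{\gamma}\right) = E_{\Gamma}\left(\beta_{eff}^{\gamma}\,\Delta T^{\gamma}\right) \tag{20}$$

$$2D_{collect}\,E_{\Gamma}(\Delta T^{\gamma}) = E_{\Gamma}\left(k_2(\Delta X, \Delta X)^{\gamma}\right) = 2E_{\Gamma}\left(D_{eff}^{\gamma}\,\Delta T^{\gamma}\right) \tag{21}$$

where the expectations $E_{\Gamma}(\cdot)$ are calculated over the ensemble of paths.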
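Definition 3.3 amounts to a time-weighted average of the per-path effective parameters. The sketch below assumes that the ensemble expectation is approximated by an equally weighted average over the observed paths, and that the drift equation has the same form as the diffusion one; the function name and the two example paths are hypothetical.

```python
def collective_parameters(paths):
    """Collective drift and diffusion from per-path (beta_eff, d_eff, total_time) triples.

    The ensemble expectations are replaced by sample averages over paths, so the
    collective parameters are averages of the effective ones weighted by path duration.
    """
    total_time = sum(T for _, _, T in paths)
    beta_collect = sum(b * T for b, _, T in paths) / total_time
    d_collect = sum(d * T for _, d, T in paths) / total_time
    return beta_collect, d_collect

# Two hypothetical paths: (beta_eff, d_eff, tracking time).
beta_collect, d_collect = collective_parameters([(0.1, 1.0, 10.0), (0.3, 2.0, 30.0)])
print(beta_collect, d_collect)
```

Longer paths therefore contribute more to the collective parameters than short ones, which matches the role of $\Delta T^{\gamma}$ inside the expectations.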



3.2 The estimators 

We now propose consistent and unbiased estimators of the effective and
collective parameters defined in the previous subsection.

Proposition 3.2

For each distinct $\delta t_i$, let $(\delta x_i^{obs})^{(\alpha)}$, $\alpha = 1, \dots, N$, be $N$ values sampled from a
distribution described by its cumulants $k^{obs}_{a(i)}$, $a = 1, 2, \dots$. Here, $N$ can be the
actual number of available observations ($n_i$) or a number of values obtained
by re-sampling with replacement from each such group. Assume $k^{obs}_{1(i)} = 0$,
to simplify notations. For non-null means, one needs to "center" the
observations (subtracting the estimated means), but all properties remain valid in what
follows if that is the case.

Denote by $\overline{\delta x_i^{obs}}$ the sample average of $(\delta x_i^{obs})^{(\alpha)}$. The estimator of the effective
drift parameter given by:



is consistent and unbiased. 



Proposition 3.3 

Under the same assumptions as in Proposition 3.2, a consistent and unbiased
estimator of the effective diffusion parameter is given by: 



The proof of both propositions is given in the Appendix.

In a similar manner, averaging over paths will give estimators for the
collective parameters, which in turn can be used in estimating the migration
proportions as derived in section 2.2. Re-sampling (with replacement) methods can
easily provide confidence intervals for either the effective parameters (when
re-sampling the observations of each path) or the collective ones (when re-sampling in two
stages, at path and observation level).
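A percentile bootstrap for a single path can be sketched as follows. This is only an illustration: the drift estimate used here (total displacement over total time) is an assumed moment-type estimate in the spirit of Definition 3.1, not necessarily the estimator of Proposition 3.2, and the simple pairwise resampling ignores the correlation between neighbouring observed increments. All names, the simulated data and the seeds are hypothetical.

```python
import random

def drift_estimate(dxs, dts):
    """Assumed moment-type drift estimate: total displacement over total time."""
    return sum(dxs) / sum(dts)

def bootstrap_ci(dxs, dts, n_boot=2000, level=0.95, seed=0):
    """Percentile confidence interval from resampling (dx, dt) pairs with replacement."""
    rng = random.Random(seed)
    n = len(dxs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(drift_estimate([dxs[i] for i in idx], [dts[i] for i in idx]))
    estimates.sort()
    lo = estimates[int((1.0 - level) / 2.0 * n_boot)]
    hi = estimates[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lo, hi

# Simulated increments (assumed values): drift 0.3, diffusion 1.0, three interval lengths.
rng = random.Random(42)
dts = [rng.choice([0.5, 1.0, 2.0]) for _ in range(300)]
dxs = [0.3 * dt + rng.gauss(0.0, (2.0 * 1.0 * dt) ** 0.5) for dt in dts]
estimate = drift_estimate(dxs, dts)
lo, hi = bootstrap_ci(dxs, dts)
print(estimate, lo, hi)
```

For the collective parameters, the same idea would be applied in two stages, first resampling paths and then observations within each selected path.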

Remark 3.4 

The relation between the observed large scale and true (under the model)
cumulants is additive, so knowing the error variances $k_2^e$ allows us to determine
the latter from (22) and (23).

Remark 3.5 

The estimators we propose are fundamentally different from the ones derived
in the literature (see [3] for a good review). The variance (or the integrated
variance, when time dependence is allowed) of the observed positions is usually
obtained under the assumption of a uniform time-interval distribution or, if not,
by relying on Taylor expansions. In all these cases, the estimator (or its first
approximation) of $k_2(\Delta X, \Delta X)$ took a single common form. We exploit the
time interval distribution and the correlation structures in a different manner.
In the particular case of a uniform distribution one may easily check that we
recover the known estimators.

4 Case study 

In this section, we apply our proposed model to a real data set which consists
of locations recorded at finite random time intervals by satellite tags attached
to 19 hooded seals. The hooded seal (Cystophora cristata) is a key pinniped
species in the Greenland and Norwegian Seas.

The distribution and behaviour of these animals have been studied [13], [H]
by tagging a group of seals with satellite-linked platform terminal transmitters
(PTT) on the sea ice near Jan Mayen. A total of 12,834 locations were
determined during an overall tracking period of 3,787 seal days, and their range was
very wide: from 54° N to 84° N, and from 41° W to 16° E.

In figures 1, 2, 3, we give examples of the migration paths of 3 seals and the
empirical distributions of: the distances along the Ox (absolute value of longitude)
and Oy (latitude) axes, the lengths of the time intervals between measurements
and the centered and scaled observations.

Two models were proposed for this data. An effective, individual model, 
which provided estimated diffusion parameters for each seal, and a collective 
one, which gives estimates of the parameters characterizing the whole group. 

The models were also tested against each other, in order to decide which 
one is more appropriate for statistical inference. A test based on the asymp- 
totic approximation for the distribution of these parameters gave non-significant 
differences between individual and collective parameters. The same conclusion 
is illustrated by the qq - plot in figure 5 which is obtained from the empirical 
distributions under the two models generated by re-sampling with replacement. 

The main stochastic effect seems to be due to pure diffusion: the collective
drift is very weak ($(\beta\Delta T)^2 \ll 2D\Delta T$, for $\Delta T$ of the order of 3, 4 or 6 months).
This is in accordance with previous observations which indicate that hooded seals
do not display any general seasonal migration pattern.

5 Conclusions 

In this article, we have modeled two types of discrete data (observed at two
space-time scales) by a single diffusion process. This allowed us: (i) to derive
consistent and unbiased statistical estimators for large scale model parameters
as functions of small scale observations, but also (ii) to express large scale
quantities of interest (like migration proportions) in terms of a minimal number of
parameters.

We have applied this procedure to a migration data set, where only small
scale discrete observations were available.

In addition, since the methods described here give a flexible description
of (although are not restricted to) migration processes and have closed-form
solutions, they can be readily incorporated in complex models of population
dynamics, such as models implemented in Gadget [7], or similar modelling
environments.

6 Appendix 

Proof of Proposition 3.1: 

Let $\delta x_i^{obs} = x_i^{obs} - x_{i-1}^{obs}$, for $i = 1, 2, \dots, n$, where $n$ is the number of samples.
Then $\delta x_i^{obs} = x_i + e_i - x_{i-1} - e_{i-1}$. Let $x_i$ be distributed as (3) and let $e_i$ be
i.i.d., i.e. $k_a(e_i, \dots, e_i) = k_a^e$, where $k_a(e_i, \dots, e_i)$ stands for the joint cumulant of
$a$ terms $e_i$.

The joint cumulants $k_a(\delta x_i^{obs}, \dots, \delta x_j^{obs}, \dots)$, where the total number of terms
is $a$, can be calculated by using the multi-linearity properties of cumulants. Any $x_i$
and $e_i$ are independently distributed, so any joint cumulants involving both types
of terms vanish. The small scale distribution of the true positions is a Gaussian,
so all higher cumulants ($a > 2$) of the type $k_a(\delta x_i, \dots, \delta x_j)$ are zero.

Therefore, when $a > 2$, we are left with terms of the type: $k_a(e_i - e_{i-1}, e_j -
e_{j-1}, \dots)$. This can be written as: $k_a(e_i, \dots, e_j, \dots) + (-1)^{a_i} k_a(e_{i-1}, \dots, e_j, \dots) +
(-1)^{a_j} k_a(e_i, \dots, e_{j-1}, \dots) + (-1)^{a} k_a(e_{i-1}, \dots, e_{j-1}, \dots)$.

We denoted by $e_i, \dots$ a number $a_i$ of $e_i$'s, and similarly for $j$, $i - 1$ and $j - 1$.

When $j = i$, only the first and last contributions are non-zero, so that:
$k_a(\delta x_i^{obs}, \dots, \delta x_i^{obs}) = k_a^e + (-1)^{a} k_a^e$. When $j = i - 1$, only the second type of
contributions do not cancel: $(-1)^{a_i} k_a(e_{i-1}, \dots, e_j, \dots) = (-1)^{a_i} k_a(e, \dots, e) = (-1)^{a_i} k_a^e$, while for $j = i + 1$
we obtain: $(-1)^{a_i+1} k_a(e_i, \dots, e_{i+1}, \dots) = (-1)^{a_i+1} k_a(e, \dots, e) = (-1)^{a_i+1} k_a^e$.
Note:

For $a = 2$ we obtain $k_2(\delta x_i^{obs}, \delta x_i^{obs}) = k_2(\delta x_i, \delta x_i) + 2k_2^e = 2\int_{t_{i-1}}^{t_i} D(t)\,dt + 2k_2^e$
and $k_2(\delta x_i^{obs}, \delta x_j^{obs}) = -k_2^e$ for $j = i \pm 1$, since the true increments over disjoint time intervals are uncorrelated.

Proof of Proposition 3.2:

(i) the estimator (22) is unbiased:

(ii) the estimator (22) is consistent:

$\hat{\beta}_{eff} \sum_i \delta t_i = \sum_i \hat{k}_{1(i)}$ and (see [15]) each $\hat{k}_{1(i)} \to k_{1(i)}$.

Proof of Proposition 3.3:

(i) the estimator (23) is unbiased:

$E\left(2\hat{D}_{eff} \sum_i \delta t_i\right) = \sum_i \sum_j E\left(\hat{k}_{2,(ij)}\right) = \sum_i \sum_j k_{2,(ij)} = 2D_{eff} \sum_i \delta t_i$

(ii) the estimator (23) is consistent:

$2\hat{D}_{eff} \sum_i \delta t_i = \sum_i \sum_j \hat{k}_{2,(ij)}$ and (see [15]) each $\hat{k}_{2,(ij)} \to k_{2,(ij)}$.
Figure 1: Example of observed path (9874), empirical distributions of: distances (dx
and dy) along Ox and Oy respectively, time intervals (dt) between observations,
centered and scaled observations (brownx and browny respectively) along Ox and Oy
(where Ox corresponds to absolute values of longitude and Oy to latitude values).

Figure 2: Example of observed path (9665), empirical distributions of: distances (dx
and dy) along Ox and Oy respectively, time intervals (dt) between observations,
centered and scaled observations (brownx and browny respectively) along Ox and Oy
(where Ox corresponds to absolute values of longitude and Oy to latitude values).

Figure 3: Example of observed path (9669), empirical distributions of: distances (dx
and dy) along Ox and Oy respectively, time intervals (dt) between observations,
centered and scaled observations (brownx and browny respectively) along Ox and Oy
(where Ox corresponds to absolute values of longitude and Oy to latitude values).

Figure 4: Empirical distributions of centered and scaled observations under effective
model and collective model (panel title: Oy residuals, model 2).

Figure 5 (panel titles): Ox qqplot, Oy qqplot.